Abstract. This comment emphasizes the importance of model
checking and model fitting when making inferences about
finite population quantities. It also suggests the value of
using unit level models when making inferences for small
subpopulations, that is, "small area" analyses.

Key words and phrases: Diagnostics, hierarchical structure, model 
checking, model fitting, small area statistics, unit level models. 






Professor Rao has written an excellent review of 
the alternative methods of making inference for fi- 
nite population quantities. This is an underserved 
field of research and, hopefully, this paper will en- 
courage some readers to make contributions to this 
important, practical area. 

Rather than commenting on detailed aspects of 
the paper, I will discuss two broad areas. Both are 
treated briefly in this article, but have not been con- 
sidered in the survey sampling literature as fully as 
I think they should be. The first is the fitting of 
models to complex survey data, and the second is 
model checking. 

Except for the design-based approach, all of the 
inferential methods described in this paper rely
significantly on models. And, over the past thirty years,
great strides have been made in developing models that
are consistent with observed data. My impression, 
though, is that survey statisticians have been slow to 
adopt these methodological advances. In Section 1 
Rao writes, referring to Hansen, Madow and Tepping
(1983), "Unfortunately, for large samples [model
dependent approaches] may perform very poorly
under model misspecifications; even small model de- 
viations can cause serious problems." This example 
(in Hansen, Madow and Tepping, 1983) was ana- 
lyzed almost thirty years ago and only by the au- 
thors. One would hope that current methodology 
and skills in data analysis would provide an im- 
provement over the Hansen, Madow and Tepping 
(1983) "straw man," the usual ratio estimator. As 
noted by Hansen, Madow and Tepping (1983), one 
should use robust methods. But, there have been 
other advances in diagnostic techniques and infer- 
ential methods (e.g., model averaging). Moreover, 
this is a single example and, before drawing gen- 
eral conclusions, it would be preferable to consider 
this example again and analyze other examples typ- 
ical of sample survey data. Finally, though, it is im- 
portant to note that there are challenging problems 
in modeling data from complex sample surveys be- 
cause there may be several stages of cluster sam- 
pling, small sample sizes (typically in inconvenient 
places), possible selection biases, nonresponse and 
measurement errors. 

When the objective is inference for "small area" 
quantities there are special issues with modeling. 
In my experience almost all of the applications use 
an area-level model; see, for example, Section 5 of 
this paper and Rao (2003). (Moreover, there are 
many applications that are not reported in the ref- 
ereed literature, and I do not know of any that use 
a unit-level model.) In a small area analysis one is
concerned about the quality of the direct estimator,
$\hat{\theta}_i$, and, thus, uses a model that adds
information about other small areas to improve
inference about $\theta_i$. Clearly, then, the quality
of the estimated variance of $\hat{\theta}_i$,
$v(\hat{\theta}_i)$, is even more questionable. (Rao
notes this in Section 5, i.e., "the second assumption
of known sampling variances is more problematic.")
Moreover, is it reasonable to assume that
$(\hat{\theta}_i - \theta_i)/\sqrt{v(\hat{\theta}_i)}$
is satisfactorily approximated by a standard normal
distribution? A transformation of $\hat{\theta}_i$ may
be helpful. But, choosing the transformation and
verifying that the associated standardized quantity is
approximately distributed as $N(0, 1)$ is a
challenging exercise. There is a better way, though,
and that is to model the unit level data as, for
example, in Battese, Harter and Fuller (1988), Malec,
Sedransk, Moriarity and LeClere (1997) and Malec
(2005). Doing so has a second benefit. In such
circumstances one can investigate alternative ways to
make inference about the $\theta_i$ from an area-level
model (because the microdata are now available and one
can investigate sampling distributions of the
transformed $\hat{\theta}_i$'s).
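
The question of whether the standardized direct estimator is
close to standard normal can be explored by simulation once
unit level data are available. The following is only a
minimal sketch under assumptions that are not taken from the
paper: units within an area are drawn independently from a
lognormal(0, 1) population (a hypothetical stand-in for skewed
survey data), the direct estimator is the sample mean, and its
variance is estimated by s^2/n. The function name
standardized_coverage is illustrative.

```python
import math
import random
import statistics

def standardized_coverage(n, reps=2000, seed=0):
    """Share of replicates with |(theta_hat - theta) / sqrt(v)| <= 1.96."""
    rng = random.Random(seed)
    # True mean of the assumed lognormal(0, 1) population: exp(mu + sigma^2 / 2).
    theta = math.exp(0.5)
    hits = 0
    for _ in range(reps):
        y = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
        theta_hat = statistics.fmean(y)      # direct estimator for the area
        v = statistics.variance(y) / n       # estimated sampling variance
        z = (theta_hat - theta) / math.sqrt(v)
        hits += abs(z) <= 1.96
    return hits / reps

# Compare a very small area sample with a moderately large one.
small = standardized_coverage(n=5)
large = standardized_coverage(n=100)
print(f"n=5: {small:.3f}   n=100: {large:.3f}   nominal: 0.950")
```

If the standard normal approximation held, both proportions
would be near 0.95. A shortfall at small n is the kind of
evidence that would motivate a transformation or a unit level
model, although conclusions from one assumed population should
not be generalized.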

Model checking is an essential part of the mode- 
ling process. In Section 5, Rao writes that "some of 
the default HB model-checking measures that are wi- 
dely used may not be necessarily good for detecting 
model deviations. For example, the commonly used 
posterior predictive p-value (PPP) for checking
goodness-of-fit may not be powerful enough to detect
non-normality of random effects ... because this
measure makes 'double use' of the data ...." There are
methods that take care of this problem, for example, 
the partial PPP and conditional PPP (Bayarri and 
Berger, 2000), and the newer CPPP (Hjort, Dahl 
and Steinbakk, 2006). While these are computation- 
ally intensive, this should not be a major limita- 
tion in the current era. (See Ma, Sun and Sedransk, 
2010, for a recent implementation of CPPP.) I think, 
though, that there are other considerations that are 
probably even more important. First, choosing the 
appropriate test quantities to assess the fit of the 
currently entertained model is essential. And, this 
is difficult because an appropriate selection depends 
on guessing the nature of the aberration of the cur- 
rently entertained model from one that is closer to 
the one that generated the observed data. See, for 
example, Yan and Sedransk (2006, 2007, 2010) who 
investigated in detail the problem of detecting un- 
known hierarchical structure (e.g., fitting a model 
with a single stage when, in actuality, there are two
stages). Moreover, is it important to detect relatively
small discrepancies from the model currently
being entertained? One may be requiring more "po- 
wer" than is warranted by the intended use of the 
data. Additionally, tests of goodness-of-fit are
problematic, especially in the frequentist paradigm,
since such tests are constructed to reject null
hypotheses, whereas one would like to accept a
postulated model if the data are concordant with it.
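
To make the role of the test quantity concrete, the plain PPP
can be sketched for a deliberately simple case. The sketch
below rests on assumptions that are not taken from the paper:
the model is y_i ~ N(mu, sigma^2) with the noninformative
prior proportional to 1/sigma^2, the test quantity is the
absolute sample skewness, and both data sets are constructed
for illustration. The names skewness and ppp_normal are
hypothetical. This is the ordinary PPP, which uses the data
both to form the posterior and to evaluate the test quantity;
it does not implement the partial, conditional or CPPP
variants cited above.

```python
import math
import random
import statistics

def skewness(y):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(y)
    m = statistics.fmean(y)
    m2 = sum((v - m) ** 2 for v in y) / n
    m3 = sum((v - m) ** 3 for v in y) / n
    return m3 / m2 ** 1.5

def ppp_normal(y, test=lambda x: abs(skewness(x)), draws=2000, seed=1):
    """Plain PPP, P(T(y_rep) >= T(y) | y), for a normal model (assumed prior 1/sigma^2)."""
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    s2 = statistics.variance(y)
    t_obs = test(y)
    exceed = 0
    for _ in range(draws):
        # sigma^2 | y ~ (n - 1) s^2 / chi^2_{n-1}; chi^2_k is Gamma(k/2, scale=2).
        sigma2 = (n - 1) * s2 / rng.gammavariate((n - 1) / 2, 2.0)
        # mu | sigma^2, y ~ N(ybar, sigma^2 / n).
        mu = rng.gauss(ybar, math.sqrt(sigma2 / n))
        y_rep = [rng.gauss(mu, math.sqrt(sigma2)) for _ in range(n)]
        exceed += test(y_rep) >= t_obs
    return exceed / draws

# Illustrative data: idealized normal quantiles, and the same with one gross outlier.
n = 40
symmetric = [statistics.NormalDist().inv_cdf((k + 0.5) / n) for k in range(n)]
outlier = symmetric[:-1] + [25.0]

p_sym = ppp_normal(symmetric)
p_out = ppp_normal(outlier)
print(f"PPP (symmetric data): {p_sym:.3f}   PPP (one outlier): {p_out:.3f}")
```

A test quantity based on skewness flags the outlier data
because it targets that particular departure; a different
aberration, such as unmodeled hierarchical structure, would
call for a different test quantity.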

Finally, in Sections 4 and 5, Rao has discussed 
some applications of Bayesian methods to sample 
survey data. Sedransk (2008), referenced in Rao's 
paper, describes other areas where the use of Baye- 
sian techniques should be useful, and also points out 
some limitations. 